Smart Paper Notes

Comparative study of deep learning methods for the automatic segmentation of lung, lesion and lesion type in CT scans of COVID-19 patients

Sofie Tilborghs , Ine Dirks , Lucas Fidon , Siri Willems , Tom Eelbode , Jeroen Bertels , Bart Ilsen , Arne Brys , Adriana Dubbeldam , Nico Buls

Category: Computer Vision

2020-07-29

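Recent research on COVID-19 suggests that CT imaging provides useful information for assessing disease progression and assisting diagnosis, as well as for understanding the disease. A growing number of studies propose using deep learning to quantify COVID-19 quickly and accurately from chest CT scans. The main tasks of interest are the automatic segmentation of lungs and lung lesions in chest CT scans of patients with confirmed or suspected COVID-19. In this study, we compare twelve deep learning algorithms, both open-source and developed in-house, on a multi-center dataset. The results show that ensembling different methods improves overall test-set performance for lung segmentation, binary lesion segmentation and multiclass lesion segmentation, yielding mean Dice scores of 0.982, 0.724 and 0.469, respectively. The resulting binary lesion segmentations have a mean absolute volume error of 91.3 ml. In general, distinguishing different lesion types is more difficult, with a mean absolute volume difference of 152 ml and mean Dice scores of 0.369 and 0.523 for consolidation and ground-glass opacity, respectively. All methods perform binary lesion segmentation with an average volume error that is better than visual assessment by human raters, suggesting that these methods are mature enough for a large-scale evaluation for use in clinical practice.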
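The two metrics quoted in this abstract, the Dice score and the absolute volume error, can be illustrated with a minimal sketch. It assumes binary masks flattened to sequences of 0/1 and a known voxel volume; the function names, the toy masks and the voxel size are illustrative assumptions, not taken from the paper.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def absolute_volume_error_ml(pred, truth, voxel_volume_mm3):
    """Absolute difference in segmented volume, in millilitres (1 ml = 1000 mm^3)."""
    pred_voxels = sum(1 for p in pred if p)
    truth_voxels = sum(1 for t in truth if t)
    return abs(pred_voxels - truth_voxels) * voxel_volume_mm3 / 1000.0


# Toy example: 8 voxels, each assumed to measure 2 x 2 x 2.5 mm = 10 mm^3.
prediction = [1, 1, 1, 0, 0, 0, 1, 0]
reference = [1, 1, 0, 0, 0, 1, 1, 1]
print(dice_score(prediction, reference))
print(absolute_volume_error_ml(prediction, reference, voxel_volume_mm3=10.0))
```

Note that a small volume error does not imply a high Dice score: two masks of equal size that barely overlap have zero volume error but a Dice score near zero, which is presumably why the abstract reports both.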


On the Challenges of using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects

Sumana Basu , Marc-André Legault , Adriana Romero-Soriano , Doina Precup

Category: Machine Learning

2023-01-02

Drug dosing is an important application of AI, which can be formulated as a Reinforcement Learning (RL) problem. In this paper, we identify two major challenges of using RL for drug dosing: delayed and prolonged effects of administering medications, which break the Markov assumption of the RL framework. We focus on prolongedness and define PAE-POMDP (Prolonged Action Effect-Partially Observable Markov Decision Process), a subclass of POMDPs in which the Markov assumption does not hold specifically due to prolonged effects of actions. Motivated by the pharmacology literature, we propose a simple and effective approach to converting drug dosing PAE-POMDPs into MDPs, enabling the use of the existing RL algorithms to solve such problems. We validate the proposed approach on a toy task, and a challenging glucose control task, for which we devise a clinically-inspired reward function. Our results demonstrate that: (1) the proposed method to restore the Markov assumption leads to significant improvements over a vanilla baseline; (2) the approach is competitive with recurrent policies which may inherently capture the prolonged effect of actions; (3) it is remarkably more time and memory efficient than the recurrent baseline and hence more suitable for real-time dosing control systems; and (4) it exhibits favorable qualitative behavior in our policy analysis.


A Machine Learning Enhanced Approach for Automated Sunquake Detection in Acoustic Emission Maps

Vanessa Mercea , Alin Razvan Paraschiv , Daniela Adriana Lacatus , Anca Marginean , Diana Besliu-Ionescu

Category: Computer Vision | Machine Learning

2022-12-13

Sunquakes are seismic emissions visible on the solar surface, associated with some solar flares. Although discovered in 1998, they have only recently become a more commonly detected phenomenon. Despite the availability of several manual detection guidelines, to our knowledge, the astrophysical data produced for sunquakes is new to the field of Machine Learning. Detecting sunquakes is a daunting task for human operators and this work aims to ease and, if possible, to improve their detection. Thus, we introduce a dataset constructed from acoustic egression-power maps of solar active regions obtained for Solar Cycles 23 and 24 using the holography method. We then present a pedagogical approach to the application of machine learning representation methods for sunquake detection using AutoEncoders, Contrastive Learning, Object Detection and recurrent techniques, which we enhance by introducing several custom domain-specific data augmentation transformations. We address the main challenges of the automated sunquake detection task, namely the very high noise patterns in and outside the active region shadow and the extreme class imbalance given by the limited number of frames that present sunquake signatures. With our trained models, we find temporal and spatial locations of peculiar acoustic emission and qualitatively associate them to eruptive and high energy emission. While noting that these models are still in a prototype stage and there is much room for improvement in metrics and bias levels, we hypothesize that their agreement on example use cases has the potential to enable detection of weak solar acoustic manifestations.


Contrastive View Design Strategies to Enhance Robustness to Domain Shifts in Downstream Object Detection

Kyle Buettner , Adriana Kovashka

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-09

Contrastive learning has emerged as a competitive pretraining method for object detection. Despite this progress, there has been minimal investigation into the robustness of contrastively pretrained detectors when faced with domain shifts. To address this gap, we conduct an empirical study of contrastive learning and out-of-domain object detection, studying how contrastive view design affects robustness. In particular, we perform a case study of the detection-focused pretext task Instance Localization (InsLoc) and propose strategies to augment views and enhance robustness in appearance-shifted and context-shifted scenarios. Amongst these strategies, we propose changes to cropping such as altering the percentage used, adding IoU constraints, and integrating saliency based object priors. We also explore the addition of shortcut-reducing augmentations such as Poisson blending, texture flattening, and elastic deformation. We benchmark these strategies on abstract, weather, and context domain shifts and illustrate robust ways to combine them, in both pretraining on single-object and multi-object image datasets. Overall, our results and insights show how to ensure robustness through the choice of views in contrastive learning.
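One of the cropping changes mentioned above, adding an IoU constraint between the two views, can be sketched as rejection sampling. This is only an illustration under assumptions: the function names, the crop-size range and the 0.3 threshold are hypothetical and are not taken from the paper or from the InsLoc code.

```python
import random


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def random_crop(rng, width, height, min_frac, max_frac):
    """Sample a crop whose sides are random fractions of the image sides."""
    w = rng.uniform(min_frac, max_frac) * width
    h = rng.uniform(min_frac, max_frac) * height
    x0 = rng.uniform(0, width - w)
    y0 = rng.uniform(0, height - h)
    return (x0, y0, x0 + w, y0 + h)


def sample_view_pair(rng, width, height, min_iou=0.3,
                     min_frac=0.4, max_frac=0.9, max_tries=1000):
    """Rejection-sample two crops until their IoU is at least min_iou."""
    for _ in range(max_tries):
        view_a = random_crop(rng, width, height, min_frac, max_frac)
        view_b = random_crop(rng, width, height, min_frac, max_frac)
        if box_iou(view_a, view_b) >= min_iou:
            return view_a, view_b
    raise RuntimeError("no crop pair satisfied the IoU constraint")


rng = random.Random(0)  # fixed seed so the example is repeatable
view_a, view_b = sample_view_pair(rng, width=640, height=480)
print(view_a, view_b, box_iou(view_a, view_b))
```

Raising `min_iou` forces the two views to share more content, while lowering it lets them drift apart; the threshold is the knob such a constraint exposes.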


Comparison of Lexical Alignment with a Teachable Robot in Human-Robot and Human-Human-Robot Interactions

Yuya Asano , Diane Litman , Mingzhi Yu , Nikki Lobczowski , Timothy Nokes-Malach , Adriana Kovashka , Erin Walker

Category: Natural Language Processing | Robotics

2022-09-23

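Speakers build rapport in the process of aligning with each other. Rapport with a teachable agent while instructing domain material has been shown to benefit learning. Past work on lexical alignment in education has been limited both in the measures used to quantify alignment and in the types of interactions in which alignment with agents has been studied. In this paper, we apply alignment measures based on a data-driven notion of shared expressions (possibly composed of multiple words) and compare alignment in one-on-one human-robot (H-R) interactions with alignment in the H-R portions of collaborative human-human-robot (H-H-R) interactions. We find that students in the H-R setting align with a teachable robot more than students in the H-H-R setting, and that the relationship between lexical alignment and rapport is more complex than previous theoretical and empirical work predicts.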


Mates2Motion: Learning How Mechanical CAD Assemblies Work

James Noeckel , Benjamin T. Jones , Karl Willis , Brian Curless , Adriana Schulz

Category: Computer Vision

2022-08-02

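We describe our work on inferring the degrees of freedom between mated parts in mechanical assemblies using deep learning on CAD representations. We train our model on a large dataset of real-world mechanical assemblies consisting of CAD parts and the mates that join them. We present methods for redefining these mates so that they better reflect the motion of the assembly, and for narrowing down the possible axes of motion. We also conduct a user study to create a motion-annotated test set with more reliable labels.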


GreenDB -- A Dataset and Benchmark for Extraction of Sustainability Information of Consumer Goods

Alexander Flick , Sebastian Jaeger , Jessica Adriana Sanchez Garcia , Kaspar von den Driesch , Karl Brendel , Felix Biessmann

Category: Machine Learning

2022-07-21

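The production, shipping, usage, and disposal of consumer goods have a substantial impact on greenhouse gas emissions and resource depletion. Machine Learning (ML) can help foster sustainable consumption patterns by accounting for sustainability aspects in the product search or recommendations of modern retail platforms. However, the lack of large, high-quality, publicly available product data with trustworthy sustainability information impedes the development of ML technology that could help reach our sustainability goals. Here we present GreenDB, a database that collects products from European online shops on a weekly basis. As a proxy for a product's sustainability, it relies on sustainability labels that are evaluated by experts. The GreenDB schema extends the well-known schema.org Product definition and can be readily integrated into existing product catalogs. We present initial results showing that ML models trained on our data can reliably (F1 score of 96%) predict the sustainability label of products. These contributions can help complement existing e-commerce experiences and ultimately encourage users toward more sustainable consumption patterns.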
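For readers unfamiliar with the F1 score quoted above, the following is a minimal sketch of how it is computed for a single label. The label names and the toy predictions are made up for illustration; the abstract does not say whether the reported 96% is a per-label, micro-averaged or macro-averaged value.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one label: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: "eco" marks a product carrying a sustainability label.
truth = ["eco", "eco", "eco", "none", "none", "eco"]
predicted = ["eco", "eco", "none", "none", "eco", "eco"]
print(f1_score(truth, predicted, positive="eco"))
```

In this toy case there are 3 true positives, 1 false positive and 1 false negative, so precision and recall are both 0.75 and so is the F1 score.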


Revisiting Hotels-50K and Hotel-ID

Aarash Feizi , Arantxa Casanova , Adriana Romero-Soriano , Reihaneh Rabbany

Category: Computer Vision

2022-07-20

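In this paper, we propose revisited versions of two recent hotel recognition datasets: Hotels-50K and Hotel-ID. The revisited versions provide evaluation setups with different levels of difficulty to better align with the intended real-world application, namely countering human trafficking. Real-world scenarios involve hotels and locations that are not captured in the current datasets, so it is important to consider evaluation settings that are truly unseen. We test this setup using multiple state-of-the-art image retrieval models and show that, as expected, the models' performance decreases as the evaluation gets closer to the real-world unseen setting. The ranking of the best-performing models also changes across the different evaluation settings, which further motivates using the proposed revisited datasets.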


Symbolic image detection using scene and knowledge graphs

Nasrin Kalanat , Adriana Kovashka

Category: Computer Vision | Machine Learning

2022-06-10

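Sometimes the meaning conveyed by images goes beyond the list of objects they contain; instead, images may express a powerful message to affect the viewer's mind. Inferring this message requires reasoning about the relationships between objects, as well as general common-sense knowledge about the components. In this paper, we use a scene graph, a graph representation of an image, to capture visual components. In addition, we generate a knowledge graph using facts extracted from ConceptNet to reason about objects and attributes. To detect the symbols, we propose a neural network framework named SKG-Sym. The framework first generates representations of the image's scene graph and its knowledge graph using a Graph Convolution Network. It then fuses the representations and classifies them with an MLP. We further extend the network to use an attention mechanism that learns the importance of the graph representations. We evaluate our method on a dataset of advertisements and compare it with baseline symbolism classification methods (ResNet and VGG). The results show that our approach outperforms ResNet in terms of F-score, and that the attention-based mechanism is competitive with VGG while having lower model complexity.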


FlexLip: A Controllable Text-to-Lip System

Dan Oneata , Beata Lorincz , Adriana Stan , Horia Cucu

Category: Artificial Intelligence

2022-06-07

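The task of converting text input into video content has become an important topic in synthetic media generation. Several methods have been proposed, some of which reach close-to-natural performance in constrained tasks. In this paper, we tackle a sub-issue of the text-to-video generation problem by converting text into lip landmarks. We do this with a modular, controllable system architecture and evaluate each of its components. Our system, named FlexLip, is split into two separate modules, text-to-speech and speech-to-lip, both built on controllable deep neural network architectures. This modularity makes it easy to replace each component, while also allowing fast adaptation to new speaker identities by disentangling or projecting the input features. We show that, using only a small amount of data for the audio generation component and as little as 5 minutes for the speech-to-lip component, the objective measures of the generated lip landmarks are comparable to those obtained with a larger set of training samples. We also introduce a series of objective evaluation measures over the complete flow of our system, taking into account several aspects of the data and system configuration. These aspects concern the quality and amount of training data, the use of pretrained models and the data contained in them, and the identity of the target speaker. Regarding the latter, we show that we can perform zero-shot lip adaptation to an unseen identity by simply updating the shape of the lips in our model.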
